Detecting Parts for Action Localization
نویسندگان
چکیده
In this paper, we propose a new framework for action localization that tracks people in videos and extracts full-body human tubes, i.e., spatio-temporal regions localizing actions, even in the case of occlusions or truncations. This is achieved by training a novel human part detector that scores visible parts while regressing full-body bounding boxes. The core of our method is a convolutional neural network which learns part proposals specific to certain body parts. These are then combined to detect people robustly in each frame. Our tracking algorithm connects the image detections temporally to extract fullbody human tubes. We apply our new tube extraction method on the problem of human action localization, on the popular JHMDB dataset, and a very recent challenging dataset DALY (Daily Action Localization in YouTube), showing state-of-the-art results.
منابع مشابه
Combined model for detecting, localizing, interpreting and recognizing faces
This work describes a method that combines face detection, localization, part interpretation and recognition, and which is capable of learning from very limited data, in a semi-supervised or even fully unsupervised manner. Current state-of-the-art techniques for face detection and recognition are subject to two major limitations: extensive training requirement, often demanding tens of thousands...
متن کاملFeasibility of detecting and localizing radioactive source using image processing and computational geometry algorithms
We consider the problem of finding the localization of radioactive source by using data from a digital camera. In other words, the camera could help us to detect the direction of radioactive rays radiation. Therefore, the outcome could be used to command a robot to move toward the true direction to achieve the source. The process of camera data is performed by using image processing and computa...
متن کاملInvertible chaotic fragile watermarking for robust image authentication.dvi
Fragile watermarking is a popular method for image authentication. In such schemes, a fragile signal that is sensitive to manipulations is embedded in the image, so that it becomes undetectable after any modification of the original work. Most algorithms focus either on the ability to retrieve the original work after watermark detection (invertibility) or on detecting which image parts have ∗Th...
متن کاملCompositional Structure Learning for Action Understanding
The focus of the action understanding literature has predominately been classification, however, there are many applications demanding richer action understanding such as mobile robotics and video search, with solutions to classification, localization and detection. In this paper, we propose a compositional model that leverages a new mid-level representation called compositional trajectories an...
متن کاملHuman Focused Action Localization in Video
We propose a novel human-centric approach to detect and localize human actions in challenging video data, such as Hollywood movies. Our goal is to localize actions in time through the video and spatially in each frame. We achieve this by first obtaining generic spatiotemporal human tracks and then detecting specific actions within these using a sliding window classifier. We make the following c...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1707.06005 شماره
صفحات -
تاریخ انتشار 2017